Mind the Gap: Essentially Optimal Algorithms for Online Dictionary Matching with One Gap
Abstract
We examine the complexity of the online Dictionary Matching with One Gap (DMOG) problem, defined as follows. Preprocess a dictionary D of d patterns, where each pattern contains a special gap symbol that can match any string, so that given a text that arrives online, one character at a time, we can report all of the patterns from D that are suffixes of the text that has arrived so far, before the next character arrives. In more general versions the gap symbols are associated with bounds determining the possible lengths of matching strings. Online DMOG captures the difficulty in a bottleneck procedure for cyber-security, as many digital signatures of viruses manifest themselves as patterns with a single gap. In this paper, we demonstrate that the difficulty in obtaining efficient solutions for the DMOG problem, even in the offline setting, can be traced back to the infamous 3SUM conjecture. We show a conditional lower bound of Ω(δ(G_D) + op) time per text character, where G_D is a bipartite graph that captures the structure of D, δ(G_D) is the degeneracy of this graph, and op is the output size. Moreover, we show a conditional lower bound in terms of the magnitude of gaps for the bounded case, thereby showing that some known offline upper bounds are essentially optimal. We also provide matching upper bounds (up to sub-polynomial factors), in terms of the degeneracy, for the online DMOG problem. In particular, we introduce algorithms whose time cost depends linearly on δ(G_D). Our algorithms make use of graph orientations, together with some additional techniques. These algorithms are of practical interest since, although δ(G_D) can be as large as √d, and even larger if G_D is a multi-graph, it is typically a very small constant in practice. Finally, when δ(G_D) is large we are able to obtain even more efficient solutions.
1998 ACM Subject Classification: F.2 Analysis of Algorithms and Problem Complexity
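To make the problem statement in the abstract concrete, here is a minimal brute-force sketch of online DMOG in Python. It is not the paper's algorithm: it checks every pattern at every character and makes no attempt to approach the degeneracy-based bounds. The function name `naive_online_dmog`, the representation of each pattern as a pair `(p1, p2)` of non-empty strings around the gap, and the choice to let the gap match the empty string are all assumptions made for illustration.

```python
from typing import Dict, Iterable, Iterator, List, Tuple


def naive_online_dmog(
    dictionary: List[Tuple[str, str]], text: Iterable[str]
) -> Iterator[List[int]]:
    """Brute-force online matcher (illustrative only, not the paper's method).

    Each dictionary entry (p1, p2) stands for the pattern "p1 <gap> p2".
    Assumptions: p1 and p2 are non-empty, and the gap is unbounded and
    may match the empty string.
    After each text character, yields the indices of the patterns that
    are suffixes of the text read so far.
    """
    # Earliest text position at which an occurrence of p1 ends, per pattern.
    first_end: Dict[int, int] = {}
    seen = ""
    for ch in text:
        seen += ch
        pos = len(seen)
        reported: List[int] = []
        for idx, (p1, p2) in enumerate(dictionary):
            # Record the first occurrence of the prefix part p1.
            if idx not in first_end and seen.endswith(p1):
                first_end[idx] = pos
            # The pattern is a suffix if p2 ends here and some occurrence
            # of p1 ended no later than where this occurrence of p2 starts.
            if (
                idx in first_end
                and seen.endswith(p2)
                and first_end[idx] <= pos - len(p2)
            ):
                reported.append(idx)
        yield reported


if __name__ == "__main__":
    patterns = [("ab", "cd"), ("a", "a")]
    for position, matches in enumerate(naive_online_dmog(patterns, "abxcda"), 1):
        print(position, matches)
```

This sketch spends time proportional to the total dictionary size on every character, which is the kind of cost the degeneracy-dependent algorithms described in the abstract are meant to avoid.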
Similar resources
Mind the gap.
The Online Dictionary Matching with Gaps Problem (DMG) is the following. Preprocess a dictionary D of d gapped patterns P1, ..., Pd such that given a query online text T presented one character at a time, each time a character arrives we must report the subset of patterns that have a match ending at this character. A gapped pattern Pi is of the form Pi,1{αi, βi}Pi,2, where Pi,1, Pi,2 ∈ Σ∗ a...
On Health Policy and Management (HPAM): Mind the Theory-Policy-Practice Gap
We argue that the field of Health Policy and Management (HPAM) ought to confront the gap between theory, policy, and practice. Although there are perennial efforts to reform healthcare systems, the conceptual barriers are considerable and reflect the theory-policy-practice gap. We highlight four dimensions of the gap: 1) the dominance of microeconomic thinking in health policy analysis and desi...
Greedy Algorithm for the Analysis Transform Domain
Many signal and image processing applications have benefited remarkably from the theory of sparse representations. In the classical synthesis model, the signal is assumed to have a sparse representation under a given known dictionary. The algorithms developed for this framework mainly operate in the representation domain. Recently, a new model has been introduced, the cosparse analysis one, in ...
Quality Gap in Educational Services at Zahedan University of Medical Sciences: Students Viewpoints about Current and Optimal Condition
Introduction. The first basic step in developing any quality improvement program is determining the quality gap and adopting strategies for removing or reducing this gap. This study was performed to determine the quality gap in educational services at Zahedan University of Medical Sciences, based on students’ perceptions and expectations. Methods. In this cross-sectional descriptive study, 38...
Dictionary Matching with One Gap
The dictionary matching with gaps problem is to preprocess a dictionary D of d gapped patterns P1, ..., Pd over alphabet Σ, where each gapped pattern Pi is a sequence of subpatterns separated by bounded sequences of don’t cares. Then, given a query text T of length n over alphabet Σ, the goal is to output all locations in T in which a pattern Pi ∈ D, 1 ≤ i ≤ d, ends. There is a renewed curre...
Publication date: 2016